Single Pass SVM using Minimum Enclosing Ball of Streaming Data
Abstract
We present a streaming algorithm for large-scale classification (in the context of the l2-SVM) that leverages connections between learning and computational geometry. The streaming model [1] imposes the constraint that only a single pass over the data is allowed. We study this model for binary classification with SVMs and propose a single-pass SVM algorithm based on the minimum enclosing ball (MEB) of streaming data [2]. We show that the MEB updates for the streaming case can be adapted to learn the SVM weight vector using simple Perceptron-like update equations. Our algorithm performs polylogarithmic computation at each example and requires only small, constant storage (O(D), where D is the dimensionality of the input space). Experimental results show that, even in such a restrictive setting, we can learn efficiently in just one pass and obtain accuracies comparable to other state-of-the-art SVM solvers.

The 2-class l2-SVM [3] is defined by a hypothesis $f(x) = w^\top \varphi(x)$ and a training set of N points $\{z_n = (x_n, y_n)\}_{n=1}^{N}$ with $y_n \in \{-1, 1\}$ and $x_n \in \mathbb{R}^D$. The only difference between the l2-SVM and the standard SVM is that the penalty term has the form (C ∑
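The abstract refers to MEB updates over a stream without spelling them out. As a rough illustration of the geometric primitive only, the sketch below grows an enclosing ball in one pass: when a point falls outside the current ball, the ball is replaced by the smallest ball that contains both the old ball and the new point. The function name `streaming_meb` and the plain-list point representation are assumptions made for this example; the paper's actual SVM update (which works on the weight vector and slack terms) is not reproduced here.

```python
import math


def streaming_meb(points):
    """One-pass enclosing ball (illustrative sketch, not the paper's SVM update).

    Keeps only a center and a radius, i.e. O(D) storage for D-dimensional input.
    """
    it = iter(points)
    center = [float(v) for v in next(it)]  # start with a zero-radius ball at the first point
    radius = 0.0
    for p in it:
        d = math.dist(center, p)
        if d <= radius:
            continue  # point already enclosed; nothing to store or update
        # Smallest ball containing the old ball and p:
        # move the center toward p by (d - radius) / 2 and set the radius to (d + radius) / 2.
        shift = (d - radius) / 2.0
        center = [c + shift * (pi - c) / d for c, pi in zip(center, p)]
        radius = (d + radius) / 2.0
    return center, radius


if __name__ == "__main__":
    stream = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (1.0, 1.0), (-1.0, 0.0)]
    c, r = streaming_meb(stream)
    print("center:", c, "radius:", r)
```

Each update moves the center a fraction of the way toward the offending point, which is the sense in which such updates resemble a Perceptron step; the resulting ball always encloses every point seen so far, but it is in general larger than the exact minimum enclosing ball.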
Similar resources
Streamed Learning: One-Pass SVMs
We present a streaming model for large-scale classification (in the context of l2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The l2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on th...
Accurate Streaming Support Vector Machines
A widely-used tool for binary classification is the Support Vector Machine (SVM), a supervised learning technique that finds the “maximum margin” linear separator between the two classes. While SVMs have been well studied in the batch (offline) setting, there is considerably less work on the streaming (online) setting, which requires only a single pass over the data using sub-linear space. Exis...
A Simple Streaming Algorithm for Minimum Enclosing Balls
We analyze an extremely simple approximation algorithm for computing the minimum enclosing ball (or the 1-center) of a set of points in high dimensions. We prove that this algorithm computes a 3/2-factor approximation in any dimension using minimum space in just one pass over the data points.
Streaming and Dynamic Algorithms for Minimum Enclosing Balls in High Dimensions
At SODA'10, Agarwal and Sharathkumar presented a streaming algorithm for approximating the minimum enclosing ball of a set of points in d-dimensional Euclidean space. Their algorithm requires one pass, uses O(d) space, and was shown to have approximation factor at most (1 + √3)/2 + ε ≈ 1.3661. We prove that the same algorithm has approximation factor less than 1.22, which brings us much closer to...
Support vector machine classification for large data sets via minimum enclosing ball clustering
Support vector machine (SVM) is a powerful technique for data classification. Despite its good theoretical foundations and high classification accuracy, the normal SVM is not suitable for classifying large data sets, because its training complexity depends heavily on the size of the data set. This paper presents a novel SVM classification approach for large data sets by using minimum ...